Considerable effort has been devoted to joint segmentation in recent years, beginning with the early work of Rother et al.~\cite{Rother2006CVPR},
 which used color histogram matching to find the common object in a pair of images. Later, other kinds of features were also used
 to exploit the relationship between image foregrounds, such as SIFT~\cite{Mukherjee2011}, saliency~\cite{Chang2011cosaliency}, and Gabor features~\cite{Hochbaum2009}.
To address the cosegmentation of multiple images, Joulin et al. formulated the task as a discriminative clustering problem in which image pixels are grouped into foreground and background clusters~\cite{Joulin2010}.
Vicente et al.~\cite{Vicente2011} proposed to extract objects from a group of images by using an object recognition scheme to generate a pool of object-like segmentations.
Chang et al.~\cite{Chang2011cosaliency} established an MRF optimization model whose data term is a cosaliency prior that hints at possible foreground locations;
their model can be solved optimally by graph cuts.
Rubio et al.~\cite{Rubio2012} established correspondences between common objects through region matching to exploit inter-image information, and jointly estimated the appearance distributions of the foreground and the background for better segmentation.

Multiple-object cosegmentation has been explored only recently.
To handle multiple object classes, Kim et al.~\cite{Kim2011ICCV} modeled the segmentation task as temperature maximization under anisotropic heat diffusion. The submodularity of the formulation guarantees a constant-factor approximation to the optimal solution.
Joulin et al. proposed an effective energy-based objective that combines a spectral-clustering term with a discriminative one, and the objective can be optimized using an efficient expectation-minimization algorithm~\cite{Joulin2012}. 
Both methods can handle multiple object classes; however, they still assume that all $K$ objects appear in every image, which is unrealistic in practice.
To segment images containing an unknown subset of objects, Kim et al. proposed to alternate between foreground modeling and region assignment~\cite{Kim2012CVPR}. The foreground modeling step learns the appearance models of the $K$ foregrounds and the background, and the region assignment step is formulated as welfare maximization in a combinatorial auction.
Li et al.~\cite{Li2013CSVT} proposed to discover unknown object-like proposals by ensemble clustering and then solved the final cosegmentation problem by multi-label energy minimization.

